The Limiting Distribution of a Test for Multivariate Structure
Abstract
We define a chi-squared statistic for p-dimensional data as follows. First, we transform the data to remove the correlations between the p variables. Then we discretize each variable into groups of equal size and compute the cell counts in the resulting p-way contingency table. Our statistic is just the usual chi-squared statistic for testing independence in a contingency table. Because the cells have been chosen in a data-dependent manner, this statistic does not have the usual limiting distribution. We derive the limiting joint distribution of the cell counts and the limiting distribution of the chi-squared statistic when the data is sampled from a multivariate normal distribution. The chi-squared statistic is useful in detecting hidden structure in raw data or residuals. It can also be used as a test for multivariate normality.
AMS 1991 subject classifications: 62H15, 62E20, 62H20.
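The construction described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `structure_chi2`, the parameter `k` (number of groups per variable), the use of an inverse Cholesky factor of the sample covariance to remove correlations, and the rank-based binning are all assumptions, since the abstract does not specify these details.

```python
import numpy as np

def structure_chi2(x, k=3):
    """Chi-squared statistic on cell counts of decorrelated, equal-size-binned data.

    Hypothetical helper: name, signature and whitening choice are assumptions.
    """
    x = np.asarray(x, dtype=float)
    n, p = x.shape

    # Step 1: remove correlations. Assumption: whiten with the inverse
    # Cholesky factor of the sample covariance matrix.
    centered = x - x.mean(axis=0)
    chol = np.linalg.cholesky(np.cov(centered, rowvar=False))
    z = np.linalg.solve(chol, centered.T).T

    # Step 2: discretize each variable into k groups of (nearly) equal size
    # using within-column ranks 0..n-1.
    ranks = z.argsort(axis=0).argsort(axis=0)
    bins = (ranks * k) // n  # group labels 0..k-1

    # Step 3: cell counts in the p-way contingency table.
    flat = np.ravel_multi_index(tuple(bins.T), (k,) * p)
    table = np.bincount(flat, minlength=k ** p).reshape((k,) * p)

    # Step 4: usual chi-squared statistic for independence, with expected
    # counts given by n times the product of the marginal proportions.
    expected = np.full((k,) * p, float(n))
    for j in range(p):
        shape = [1] * p
        shape[j] = k
        marginal = np.bincount(bins[:, j], minlength=k) / n
        expected = expected * marginal.reshape(shape)
    stat = ((table - expected) ** 2 / expected).sum()
    return stat, table

# Usage on simulated bivariate normal data (seed chosen arbitrarily).
rng = np.random.default_rng(0)
stat, table = structure_chi2(rng.normal(size=(300, 2)), k=3)
print(stat)
print(table)
```

As the abstract notes, the cells are chosen in a data-dependent way, so the resulting value should not be referred to the usual chi-squared reference distribution; the sketch only computes the statistic itself.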
Similar resources
Limiting Properties of Empirical Bayes Estimators in a Two-Factor Experiment under Inverse Gaussian Model
The empirical Bayes estimators of treatment effects in a factorial experiment were derived and their asymptotic properties were explored. It was shown that they were asymptotically optimal, and that the estimator of the scale parameter had a limiting gamma distribution while the estimators of the factor effects had a limiting multivariate normal distribution. A bootstrap analysis was performed to ill...
A Test for Multivariate Structure
We present a test for detecting 'multivariate structure' in data sets. This procedure consists of transforming the data to remove the correlations, then discretizing the data and finally, studying the cell counts in the resulting contingency table. A formal test can be performed using the usual chi-squared test statistic. We give the limiting distribution of the chi-squared statistic and also pres...
On the multivariate variation control chart
Multivariate control charts such as Hotelling's T^2 and X^2 are commonly used for monitoring several related quality characteristics. These control charts use the correlation structure that exists between quality characteristics in an attempt to improve monitoring. The purpose of this article is to discuss some issues related to the G chart proposed by Levinson et al. [9] for detecting shifts in ...
Compact Suffix Trees Resemble PATRICIA Tries: Limiting Distribution of the Depth
Suffix trees are the most frequently used data structures in algorithms on words. In this paper, we consider the depth of a compact suffix tree, also known as the PAT tree, under some simple probabilistic assumptions. For a biased memoryless source, we prove that the limiting distribution for the depth in a PAT tree is the same as the limiting distribution for the depth in a PATRICIA trie, even...
Outlier test for a group of multivariate observations
Assume that we have m independent random samples, each of size n, from N_p(μ, Σ), and our goal is to test whether or not the ith sample is an outlier (i = 1, 2, ..., m). It is well known that a test statistic exists whose null distribution is Beta and, given the relationship between the Beta and F distributions, an F test statistic can be used. In the statistical literature, however, a clear and preci...